Efficient algorithms for scalable video coding
A scalable video bitstream specifically designed for the needs of various client terminals,
network conditions, and user demands is much desired in current and future video transmission
and storage systems. The scalable extension of the H.264/AVC standard (SVC) has
been developed to satisfy the new challenges posed by heterogeneous environments, as
it permits a single video stream to be decoded fully or partially with variable quality, resolution,
and frame rate in order to adapt to a specific application. This thesis presents
novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding
mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection
algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different
prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute
Difference (MAD) prediction model.
The proposed fast inter-frame and inter-layer mode selection algorithm is based on the
empirical observation that a macroblock (MB) with slow movement is more likely to be
best matched by one in the same resolution layer. However, for a macroblock with fast
movement, motion estimation between layers is required. Simulation results show that
the algorithm can reduce the encoding time by up to 40%, with negligible degradation in
RD performance.
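
The idea behind this mode selection step can be illustrated with a minimal sketch. It is not the thesis implementation: the threshold value, the motion activity measure (mean absolute motion vector magnitude) and the mode names are all assumptions chosen for illustration. A slow-moving macroblock is only tested against same-layer modes, while a fast-moving one also gets the inter-layer candidates.

```python
from typing import List, Tuple

# Hypothetical threshold in quarter-pel units; the abstract does not give a value.
SLOW_MOTION_THRESHOLD = 4.0

# Illustrative mode names only, not the identifiers used by any SVC encoder.
SAME_LAYER_MODES = ["SKIP", "INTER_16x16", "INTER_16x8", "INTER_8x16", "INTER_8x8"]
INTER_LAYER_MODES = ["BASE_LAYER_MODE", "INTER_LAYER_MOTION", "INTER_LAYER_RESIDUAL"]


def motion_activity(motion_vectors: List[Tuple[int, int]]) -> float:
    """Mean of |dx| + |dy| over the given motion vectors (an assumed measure)."""
    if not motion_vectors:
        return 0.0
    return sum(abs(dx) + abs(dy) for dx, dy in motion_vectors) / len(motion_vectors)


def candidate_modes(motion_vectors: List[Tuple[int, int]],
                    threshold: float = SLOW_MOTION_THRESHOLD) -> List[str]:
    """Return the modes worth evaluating for one macroblock.

    Slow macroblocks skip the inter-layer candidates, which is where the
    encoding-time saving would come from in this sketch.
    """
    if motion_activity(motion_vectors) <= threshold:
        return list(SAME_LAYER_MODES)
    return SAME_LAYER_MODES + INTER_LAYER_MODES


if __name__ == "__main__":
    print(candidate_modes([(0, 1), (1, 0)]))     # slow macroblock
    print(candidate_modes([(8, 6), (10, -4)]))   # fast macroblock
```

In a real encoder the motion vectors would typically come from already coded neighbouring or co-located macroblocks, and the threshold would have to be tuned against RD performance.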
The proposed hierarchical fast mode selection scheme comprises four levels and makes
full use of inter-layer, temporal and spatial correlation as well as the texture information of
each macroblock. Overall, the new technique demonstrates the same coding performance
in terms of picture quality and compression ratio as that of the SVC standard, yet produces
a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode
selection algorithms, the proposed algorithm achieves a superior computational time reduction
under very similar RD performance conditions.
The existing SVC rate distortion model cannot accurately represent the RD properties of
the prediction modes, because it is influenced by the use of inter-layer prediction. A separate
RD model for inter-layer prediction coding in the enhancement layer(s) is therefore
introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34 dB
or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy
is maintained to within 0.07% on average.
As a MAD prediction error always exists and cannot be avoided, an optimised MAD prediction
model for the spatial enhancement layers is proposed that considers the MAD from
previous temporal frames and previous spatial frames together, to achieve a more accurate MAD prediction.
Simulation results indicate that the proposed MAD prediction model
reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation
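
One simple way to combine temporal and spatial MAD information is a weighted blend, sketched below. This is an assumption for illustration only, not the model proposed in the thesis: the function names, the blending weight and the default coefficients are hypothetical. The temporal term uses the familiar linear form (a1 * previous MAD + a2) found in H.264/AVC rate control.

```python
def predict_mad(temporal_mad: float, spatial_mad: float, weight: float = 0.5,
                a1: float = 1.0, a2: float = 0.0) -> float:
    """Blend a temporal and a spatial MAD estimate (illustrative sketch).

    temporal_mad: MAD of the co-located unit in the previous frame of this layer.
    spatial_mad:  MAD of the corresponding unit in the lower spatial layer.
    weight:       hypothetical share given to the temporal estimate, in [0, 1].
    a1, a2:       coefficients of the linear temporal predictor.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    temporal_estimate = a1 * temporal_mad + a2
    return weight * temporal_estimate + (1.0 - weight) * spatial_mad


def relative_error(predicted: float, actual: float) -> float:
    """Relative MAD prediction error; actual must be positive."""
    if actual <= 0.0:
        raise ValueError("actual MAD must be positive")
    return abs(predicted - actual) / actual


if __name__ == "__main__":
    predicted = predict_mad(temporal_mad=4.0, spatial_mad=6.0)
    print(predicted, relative_error(predicted, actual=5.5))
```

In practice the weight and the coefficients would be updated from the statistics of previously coded frames rather than fixed in advance.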